Abstract

I start by reviewing some basic properties of random graphs. I then consider the role of random 
walks in complex networks and show how they may be used to explain why so many long tailed 
distributions are found in real data sets. The key idea is that in many cases the process involves 
copying of properties of near neighbours in the network and this is a type of short random walk 
which in turn produces a natural preferential attachment mechanism. Applying this to networks of
fixed size I show that copying and innovation are processes with special mathematical properties 
which include the ability to solve a simple model exactly for any parameter values and at any time. 
I finish by looking at variations of this basic model. 

1 Random Graphs and Random Walks 

In this paper we will focus on undirected graphs where the links between vertices have no values or 
directions associated with them. These restrictions can be relaxed in many of the situations examined 
here but it simplifies the discussion without losing the essential points. In general we will allow edges 
to start and end on the same vertex (tadpoles), and for multiple edges between vertices, so Fig. 1 is a
simple example. Our graphs or networks (the terms are used interchangeably here) consist then of N
vertices and E edges. The number of edges attached to a vertex is the degree, denoted by k, and the
average degree is $\langle k \rangle$. The degree distribution n(k) is the number of vertices with degree k which when
normalised gives p(k) = n(k)/N, the probability distribution function. In section 2 we will also use
these quantities for a simple bipartite graph. Note that in most cases we imagine creating many copies 
of a network using some stochastic process. Thus the averages are often over these ensembles of graphs, 
not just over all the vertices. In particular in many cases we will actually be looking at the mean degree 
distribution and n(k) will be that obtained by averaging over such ensembles.
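
As a concrete illustration of these definitions, the following minimal Python sketch computes the degrees, n(k), p(k) and the average degree from an edge list in which tadpoles and multiple edges are allowed. The edge list is a hypothetical example (an assumption, not read off any figure) with N = 5 and E = 6, and all function and variable names are illustrative only.

```python
from collections import Counter


def degree_distribution(num_vertices, edges):
    """Return (degrees, n) where n[k] is the number of vertices of degree k.

    A tadpole (u, u) adds two to the degree of u; a repeated pair is a
    multiple edge and every copy is counted.
    """
    degrees = [0] * num_vertices
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1
    return degrees, Counter(degrees)


# Hypothetical edge list (an assumption): one tadpole on vertex 4 and a
# double edge between vertices 2 and 3, giving N = 5 and E = 6.
example_edges = [(4, 4), (4, 1), (4, 2), (4, 3), (2, 3), (2, 3)]
degrees, n = degree_distribution(5, example_edges)
mean_degree = sum(degrees) / len(degrees)            # always equals 2E/N
p = {k: count / len(degrees) for k, count in n.items()}
```

Since every edge contributes two edge ends, the average degree is 2E/N whatever the mix of tadpoles and multiple edges.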

We will often talk about an object being chosen 'randomly'. To be more precise what is meant is that
the object is chosen with a uniform probability distribution from the set of similar objects, and it should 
be obvious from the context what this set is. For instance if we choose a random vertex of a graph, what 
is meant is that the probability of choosing a given vertex is simply 1/N. 

1.1 Random Graphs 

The Classical or Erdős-Rényi Random Graphs may be defined in one of two ways. Either, for every
pair of distinct vertices, add a single edge with probability p = 2E/[N(N-1)] (so that the expected
number of edges is E), otherwise no edge is added. Alternatively add E edges between randomly chosen
vertex pairs. There is no difference between the two for large N when the graph is sparse,
$2E/N = \langle k \rangle \sim O(1)$, similar to the difference between canonical and microcanonical ensembles
in statistical mechanics, and then we find a Poisson degree distribution.
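
The two constructions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the pair probability is chosen so that the expected number of edges is E, the fixed-E version places at most one edge on each pair of distinct vertices, and the function names are made up for this example.

```python
import random


def random_graph_fixed_edges(num_vertices, num_edges, rng):
    """Place num_edges edges on distinct, uniformly chosen vertex pairs."""
    edges = set()
    while len(edges) < num_edges:
        u, v = rng.sample(range(num_vertices), 2)    # two distinct vertices
        edges.add((min(u, v), max(u, v)))            # at most one edge per pair
    return sorted(edges)


def random_graph_fixed_probability(num_vertices, p, rng):
    """Add each possible edge independently with probability p."""
    return [(u, v)
            for u in range(num_vertices)
            for v in range(u + 1, num_vertices)
            if rng.random() < p]


def mean_degree(num_vertices, edges):
    return 2 * len(edges) / num_vertices


rng = random.Random(1)
N, E = 200, 400
pair_probability = E / (N * (N - 1) / 2)             # expected edge count is E
graph_fixed_E = random_graph_fixed_edges(N, E, rng)
graph_fixed_p = random_graph_fixed_probability(N, pair_probability, rng)
```

In the first graph the number of edges is exactly E, in the second it only fluctuates around E, which is the sense in which the two ensembles resemble the microcanonical and canonical ensembles.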

Figure 1: Example of the type of network considered. Edges have no directions or values but multiple
edges and tadpoles may be allowed. Here N = 5, E = 6, the largest degree is k = 5, the average
degree is $\langle k \rangle = 12/5$, and the degree distribution is n(k = 0) = 1, n(k = 1) = 1, n(k = 3) = 2,
n(k = 5) = 1, with n(k) = 0 for remaining values of k.

Generalised random graphs are graphs which have a given degree distribution p(k) but which are
otherwise random. These may be created with the Molloy-Reed construction [60, 61] in which each vertex
a is attached to $k_a$ stubs (half an edge), where $k_a$ is drawn from the given distribution p(k). Then pairs
of stubs chosen at random (uniformly) are connected. Alternatively one may create a graph in any way
one likes with the desired p(k) and then use Maslov-Sneppen rewiring [59] to randomise the graph. Such
generalised random graphs have a given p(k) but otherwise their properties are completely random. In
particular the properties of all vertices are the same. For any given source vertex, the properties of
neighbouring target vertices will be independent of the properties of the source vertex.
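
The stub-matching idea can be sketched as follows. This is a minimal illustration, assuming the degree sequence has already been drawn from p(k) and has an even total; shuffling the list of stubs and pairing neighbours in the shuffled list is one simple way to connect stubs uniformly at random, and the function names are illustrative.

```python
import random


def stub_matching(degree_sequence, rng):
    """Pair up stubs uniformly at random and return the list of edges.

    Tadpoles and multiple edges can occur, as allowed for the graphs here.
    """
    if sum(degree_sequence) % 2 != 0:
        raise ValueError("the total number of stubs must be even")
    # Vertex a appears k_a times in the list of stubs.
    stubs = [vertex for vertex, k in enumerate(degree_sequence) for _ in range(k)]
    rng.shuffle(stubs)
    # Consecutive entries of the shuffled list form the edges.
    return list(zip(stubs[0::2], stubs[1::2]))


def degrees_from_edges(num_vertices, edges):
    degrees = [0] * num_vertices
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1
    return degrees


rng = random.Random(7)
target_degrees = [0, 1, 3, 3, 5]
edges = stub_matching(target_degrees, rng)
```

Whatever pairing is produced, the degree of every vertex is fixed by its number of stubs, so only the wiring is random.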

This means that random walks on generalised random graphs are particularly simple. However the
existence of an edge does mean that the degree distribution of neighbours is not simply p(k) because the
higher the degree of a vertex the more likely you are to arrive at that vertex, given there is no correlation
between vertices. Thus the probability that the neighbour of a vertex with degree $k_i$ has degree $k_n$ is
given by

$$ P(k_n \mid k_i) = \frac{k_n}{\langle k \rangle} \, p(k_n) . \tag{1} $$

Let us use this to find the length of random walks on random graphs. Suppose we follow a random
walk where we never go back along the edge we just arrived on. Then for infinite graphs, $N \to \infty$, our
random walks always end if $\langle k \rangle < 2$ since we must have arrived on one edge but this leaves 'less than
one edge' to continue the walk, i.e. sometimes there will be an edge but sometimes not. This must also
mean that we do not have an infinite sized component, no GCC (giant connected component). On the
other hand walks never end and we do have a GCC if $\langle k \rangle > 2$. The transition to a phase where the GCC
exists is at z = 1 where [63, 23, 39, 60]

$$ z(t) := \frac{\langle k^2 \rangle}{\langle k \rangle} - 1 = (E - 1) F_2(t) . \tag{2} $$

In fact all global properties depend on the same ratio of second and first moments, z of (2), for instance GCC
size, the component distribution, and average path lengths.
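
The ratio of moments is easy to evaluate for any degree sequence. The sketch below assumes that z denotes the usual combination $\langle k^2 \rangle / \langle k \rangle - 1$, i.e. the mean number of onward edges available after arriving at a neighbour; the function names are illustrative.

```python
def mean_neighbour_degree(degrees):
    """<k^2>/<k>: the mean degree of the vertex found at the end of a random edge."""
    return sum(k * k for k in degrees) / sum(degrees)


def branching_ratio(degrees):
    """z = <k^2>/<k> - 1: mean number of edges left to continue a walk."""
    return mean_neighbour_degree(degrees) - 1.0


z_example = branching_ratio([0, 1, 3, 3, 5])   # broad degree sequence
z_ring = branching_ratio([2] * 10)             # every vertex has degree two
z_pairs = branching_ratio([1] * 10)            # isolated pairs of vertices
```

A sequence in which every vertex has degree two sits exactly at z = 1, while isolated pairs give z = 0 so that every walk ends after one step.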

Let us use the calculation of the average path length in a generalised random graph to illustrate these
ideas (following Fronczak et al. [39]). Let $p_{ij}(x)$ be the probability that a random walk in which one never
returns along the last step taken, starting at vertex i, has not passed through vertex j after x steps. The
number of different walks of length x from i, if there are no loops, is
$W(i,x) = k_i (\langle k_n \rangle - 1)^{x-1} = k_i z^{x-1}$, where $\langle k_n \rangle = \langle k^2 \rangle / \langle k \rangle$ is the mean
neighbour degree from (1). The probability
of not arriving at j on any one step is just $1 - k_j/(2E)$. So the probability that a random walk does not
arrive at j after x steps is

$$ p_{ij}(x) = \left( 1 - \frac{k_j}{2E} \right)^{W(i,x)} \approx \exp\left\{ - \frac{k_i k_j}{2E} z^{x-1} \right\} . \tag{3} $$

The probability that the walker first arrives after x steps is then simply $p_{ij}(x-1) - p_{ij}(x)$ so the average
path length from i to j is then

$$ \ell_{ij} = \sum_{x=1}^{\infty} x \left[ p_{ij}(x-1) - p_{ij}(x) \right] = \sum_{x=0}^{\infty} p_{ij}(x) . \tag{4} $$

This gives the average path length between any two randomly chosen vertices as

$$ \langle \ell \rangle = \frac{\ln(2 E z) - 2 \langle \ln k \rangle - \gamma_E}{\ln z} + \frac{1}{2} , \qquad \gamma_E \approx 0.5772 . \tag{5} $$

Calculations like this above work because of the lack of correlations between vertices in such random 
graphs and because for large sparse graphs the graphs are basically trees with no loops. These can be
reasonable approximations for many models and perhaps for a few real graphs too. Otherwise we may 
use generalised random graphs as a null model against which we can compare other networks. Often 
these calculations are exact for closely related Urn models (see later discussions). 
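
The step from a sum over walk lengths to a logarithm and Euler's constant can be checked numerically. The sketch below assumes that the probability of not yet having arrived decays as $\exp(-b z^x)$ with small b and z > 1, and compares the direct sum over x with the approximation $(-\ln b - \gamma_E)/\ln z + 1/2$; the names and the chosen values of b and z are illustrative only.

```python
import math

EULER_GAMMA = 0.5772156649015329


def survival_sum(b, z, max_steps=2000):
    """Direct evaluation of the sum over x >= 0 of exp(-b z^x)."""
    total = 0.0
    power = 1.0                      # holds z^x
    for _ in range(max_steps):
        term = math.exp(-b * power)
        total += term
        if term == 0.0:              # all later terms vanish as well
            break
        power *= z
    return total


def logarithmic_estimate(b, z):
    """Large-graph approximation: (-ln b - gamma_E) / ln z + 1/2."""
    return (-math.log(b) - EULER_GAMMA) / math.log(z) + 0.5


direct = survival_sum(1e-4, 3.0)
estimate = logarithmic_estimate(1e-4, 3.0)
```

The two numbers agree closely for small b, which is why path lengths in sparse random graphs grow only logarithmically with the size of the graph.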



1.2 Random Walks 

Given the inspiration from the analysis of generalised random graphs, and Eq. (1) in particular, let us now
consider random walks as a tool for networks of all types. Random walks are the extreme alternative
to the use of shortest paths. One would normally use random walks to discuss the mean first passage time
etc. and these are related to the eigenvalues and eigenvectors of a transfer matrix defining a Markovian
diffusion process on the graph. They are used for calculations on generalised random graphs (as seen above),
sampling graphs, community detection [69, 70] and the natural creation of scale-free
networks [75, 33, 78, 79, 80, 18].

Consider an unbiased random walk, where one treats all edges as equal (including the one used to
arrive at the current vertex). If this is used to sample a network, that is as a tool to search the vertices
of a network, then vertices are visited with probability roughly $p_{\mathrm{visit}}(k) \sim k p(k)/(2E)$. This means that hubs (large
degree nodes) are found very quickly so the tail of the degree distribution may be easily estimated. Other
biased walks are also possible but these do not share the same special properties, e.g. they can sample vertices
equally if slowly [87, 55, 52].

When we think of random walks as a diffusion process then we define a transition matrix M in terms
of the adjacency matrix, A. If we define $A_{ij}$ to be the edge value from vertex j to vertex i, then for an
unbiased random walk we want the probability of moving from vertex j to vertex i to be the entry $M_{ij}$
where

$$ M_{ij} := \frac{A_{ij}}{k_j} . \tag{6} $$

If we suppose the number of random walkers at vertex i at time t is $v_i(t)$ then we have to solve the
matrix equation $v(t) = M^t v(0)$, a simple Markov process. For an undirected graph ($A_{ij} = A_{ji}$) the
solution is simple

$$ v_i(t) = c_1 \frac{k_i}{2E} + \sum_{n=2}^{N} c_n (\lambda_n)^t u_i^{(n)} \tag{7} $$
where the n-th eigenvector of the Markovian matrix M is $u^{(n)}$, associated with eigenvalue $\lambda_n$. Since the
adjacency matrix has non-negative entries, if we assume the graph is connected then from the
Perron-Frobenius theorem we know the eigenvalues are real and ordered as
$1 = \lambda_1 > |\lambda_2| \geq \ldots \geq |\lambda_i| \geq \ldots \geq 0$. The largest eigenvalue is equal to one, indicating a single stable long time solution specified
by the first eigenvector which is $u_i^{(1)} = k_i/(2E)$ for our undirected case. Note that adjacency matrix
eigenvectors and eigenvalues have no obvious physical interpretation, nor do they have any simple
relationship to those of the Markovian M matrix.
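
The long time behaviour can be checked directly on a small example. The sketch below assumes the unbiased walk in which column j of M is the adjacency matrix column divided by the degree $k_j$, uses a hypothetical four vertex graph (a triangle with one extra vertex attached), and repeatedly applies M to walkers that all start on one vertex. Names are illustrative.

```python
def transition_matrix(adjacency):
    """Return (M, degrees) with M[i][j] = A[i][j] / k_j for an unbiased walk."""
    size = len(adjacency)
    degree = [sum(adjacency[i][j] for i in range(size)) for j in range(size)]
    matrix = [[adjacency[i][j] / degree[j] for j in range(size)]
              for i in range(size)]
    return matrix, degree


def evolve(matrix, walkers, steps):
    """Apply v -> M v the given number of times."""
    size = len(walkers)
    for _ in range(steps):
        walkers = [sum(matrix[i][j] * walkers[j] for j in range(size))
                   for i in range(size)]
    return walkers


# Hypothetical undirected graph: triangle 0-1-2 plus vertex 3 attached to 2.
adjacency = [[0, 1, 1, 0],
             [1, 0, 1, 0],
             [1, 1, 0, 1],
             [0, 0, 1, 0]]
matrix, degree = transition_matrix(adjacency)
long_time = evolve(matrix, [1.0, 0.0, 0.0, 0.0], 200)
expected = [k / sum(degree) for k in degree]          # k_i / (2E)
```

The example graph contains a triangle, so it is not bipartite and the walkers settle down to occupation proportional to degree rather than oscillating.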

Given the existence of such eigenvectors, it is natural to try some sort of spectral analysis. The
solution (7) shows us that the eigenvectors can sometimes be interpreted as informing us about poorly
connected regions [27]. However they are generally a poor way of determining community structure.
More useful are approximation schemes where one uses just a few eigenvectors associated with the
largest eigenvalues to reduce the dimension of the matrix, M, from $O(N^2)$ to some approximate $O(N)$
structure. This is the basis of Principal Component Analysis and Singular Value Decomposition (e.g.
see [22, 85]).

One may also use these eigenvectors to provide a ranking. In this case one defines the ranking value
of vertex i to be the i-th entry of the first eigenvector of some Markovian matrix. For our unbiased walk
M we found this was simply proportional to the degree for an undirected graph. However one can easily
consider more complicated random walks. For instance

$$ M_{ij} := (1 - p_v) \frac{A_{ij}}{k_j} + p_v \frac{1}{N} \tag{8} $$

is equivalent to a walk where one follows a randomly chosen edge out of a vertex with probability
$(1 - p_v)$ but otherwise one jumps to a random vertex, essentially starting a new random walk. Such
variations are the basis for PageRank used by Google [14].
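
A ranking of this kind can be sketched by power iteration. The code below is an illustration only: it assumes the modified walk follows a random edge with probability $1 - p_v$ and jumps to a uniformly chosen vertex with probability $p_v$, it is not a description of any production ranking system, and the small graph and names are hypothetical.

```python
def ranking_matrix(adjacency, p_v):
    """M[i][j] = (1 - p_v) A[i][j] / k_j + p_v / N."""
    size = len(adjacency)
    degree = [sum(adjacency[i][j] for i in range(size)) for j in range(size)]
    return [[(1.0 - p_v) * adjacency[i][j] / degree[j] + p_v / size
             for j in range(size)] for i in range(size)]


def stationary_ranking(adjacency, p_v, steps=500):
    """Leading eigenvector of the ranking matrix found by power iteration."""
    size = len(adjacency)
    matrix = ranking_matrix(adjacency, p_v)
    rank = [1.0 / size] * size
    for _ in range(steps):
        rank = [sum(matrix[i][j] * rank[j] for j in range(size))
                for i in range(size)]
    return rank


# Hypothetical undirected graph: triangle 0-1-2 plus vertex 3 attached to 2.
adjacency = [[0, 1, 1, 0],
             [1, 0, 1, 0],
             [1, 1, 0, 1],
             [0, 0, 1, 0]]
rank_walk_only = stationary_ranking(adjacency, 0.0)     # pure random walk
rank_mixed = stationary_ranking(adjacency, 0.15)        # some random jumps
rank_jump_only = stationary_ranking(adjacency, 1.0)     # jumps only
```

With no jumps the ranking is just the degree, with only jumps every vertex is ranked equally, and intermediate values interpolate between the two.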

So why is a random walk so useful? In all these examples we are exploiting the fact that a random 
walk probes global structure of network but uses only local information. This is computationally efficient 
for computer algorithms which are searching the whole graph. However it is also exactly the same feature
that is required in the real world by those creating or using a network. A process involving only local
information is much more likely to occur naturally, i.e. no external influence is needed. The author of a
web page knows only a small neighbourhood of existing web pages. So when adding a link to their web
page an author will have surfed a local neighbourhood in a way that, when we average over the behaviour 
of many such authors, might be statistically indistinguishable from a random walk. This suggests that 
we might be able to use random walks to answer a much deeper question: Why do so many networks 
have long tailed degree distributions? 

1.3 Long Tails of Growing Networks 

Long tailed distributions are common in data sets; Fig. 2 shows two examples. This is equally true for
networks, for example Fig. 3, where long tailed distributions indicate the presence of large hubs, vertices
of high degree. Lattices, Small World networks [86], and classical random graphs have no hubs. Only
a long tailed degree distribution, such as a power law, has hubs, e.g. if $n(k) \sim k^{-3}$ then with $N = 10^6$ and
$\langle k \rangle = 4$ the largest vertex will be of degree around 2500 while for a classical random graph it will be
about 17.
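
These two numbers can be reproduced with a short estimate: the largest degree is roughly the largest K for which at least one vertex of degree K or more is expected, $N P(k \geq K) \geq 1$. The sketch assumes, for the power law, the specific cubic form $p(k) = 2m(m+1)/[k(k+1)(k+2)]$ with m = 2 (so the average degree is 4), and a Poisson distribution of mean 4 for the classical random graph; both choices and all names are assumptions of this example.

```python
import math


def power_law_tail(k_min, m=2):
    """P(k >= k_min) for p(k) = 2m(m+1) / [k(k+1)(k+2)], valid for k_min >= m."""
    return m * (m + 1) / (k_min * (k_min + 1))


def poisson_tail(k_min, mean=4.0, terms=200):
    """P(k >= k_min) for a Poisson distribution, summed term by term."""
    pmf = math.exp(-mean + k_min * math.log(mean) - math.lgamma(k_min + 1))
    total = 0.0
    for j in range(terms):
        total += pmf
        pmf *= mean / (k_min + j + 1)    # ratio of successive Poisson terms
    return total


def largest_expected_degree(tail, num_vertices, k_start):
    """Largest K with num_vertices * P(k >= K) >= 1."""
    k = k_start
    while num_vertices * tail(k + 1) >= 1.0:
        k += 1
    return k


k_max_power_law = largest_expected_degree(power_law_tail, 10**6, 2)
k_max_poisson = largest_expected_degree(poisson_tail, 10**6, 0)
```

The power law value comes out a little under 2500 and the Poisson value at 17, a difference of more than two orders of magnitude for the same number of vertices and edges.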

Figure 2: Two classic examples of long tailed distributions in real data sets. On the left is a log-log plot
of word frequency against rank for the review [34]; "the" is the most common word, "network" is
eighth. On the right is a log-log plot of the size of US "Metropolitan Areas" (measured in various ways)
against rank, all data scaled relative to the largest and by a power of ten relative to the next curve (for
visualisation purposes). Most show some evidence of a simple power law.
[Panel titles: "This Text's Word Frequency by Rank" (left) and "Ranked US City Size" (right), the latter with
curves for Population, Housing Units, Total Area, Water Area, Land Area, Population Density and Housing Density.]

Figure 3: Two examples of networks possessing long tailed distributions. On the left is a plot (with and
without logarithmic binning) of the degree distribution taken from a network derived from the publications
of the Physics department of Imperial College where the vertices are key words [35]. On the
right is the distribution of the in-degree of the logged in user graph for data taken from the website
gallery.future-i.com where users may publish pictures (taken from 0).

Figure 4: Log-log plots of degree distributions p(k) for networks with $N = 10^6$ vertices generated by random
walks started from a randomly chosen vertex, with one vertex and two edges added at each time step.
In each graph, the results are shown for average walk lengths of zero (crosses), one (squares) and seven
(circles) steps, with data averaged over 100 runs. In the left diagram, the walk length is fixed and for
each edge added a new walk is started from a randomly chosen vertex. The algorithm used for the bottom
figure has variable numbers of edges and variable walk length. Multiple edges are allowed but are rarely
created. Taken from [33].

The standard model used to produce such long tailed distributions has a long history, being discussed
by Yule [89], Simon [76, 77, 13] and Price [71, 72] amongst others. However it was put into the context
of networks by Barabási and Albert [5] who suggested the following algorithm. Suppose at time t we
add a new vertex to an existing graph. We then attach one end of each of $\langle k \rangle / 2$ new edges to the new vertex
and attach the other end to an existing vertex in the network. These existing vertices are chosen with
preferential attachment, that is they are selected with probability k/(2E) so that the "Rich get Richer":
vertices with many edges are favoured to gain more. The result is that after a relatively short time a long
tail appears with an asymptotic form $n(k) \sim k^{-3}$. Subsequent work shows that one can produce such
power law tails in many ways. By choosing to mix in some random attachment or by not adding a vertex
at every time step, one may produce any power from two to infinity [49, 50, 6, 24, 17, 48, 51]. Growth
is not essential as one might use an appropriate Hamiltonian and some rewiring scheme [41]. Even the
network picture is not needed as the older works of Yule, Simon and Price show. However, what is
clear is that the attachment probability must be exactly linear in degree (for large degrees) otherwise
non-power law degree distributions appear [49]. This raises the question, why do so many real systems
appear to be attaching new edges with perfect linear attachment probabilities? It is easy for a computer
to generate the attachment probabilities of k/(2E) but in reality in most networks each node has only
relatively limited and local information. That is, the E in the normalisation of the Barabási and Albert
algorithm is unknown!
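
For comparison with what follows, the preferential attachment rule itself can be sketched as below. This is a minimal illustration and not the original authors' code: it assumes a seed of two vertices joined by m edges, adds m edges per new vertex, allows multiple edges, and selects targets by drawing a uniformly random edge end from a global list, which is exactly the kind of global knowledge (the normalisation 2E) that a real system is unlikely to have.

```python
import random


def preferential_attachment_graph(num_vertices, edges_per_vertex, rng):
    """Grow a graph in which targets are chosen with probability k/(2E)."""
    m = edges_per_vertex
    edges = [(0, 1)] * m                  # seed: two vertices joined by m edges
    edge_ends = [0, 1] * m                # vertex v appears k_v times in this list
    for new_vertex in range(2, num_vertices):
        # A uniformly random edge end belongs to vertex v with probability k_v/(2E).
        targets = [rng.choice(edge_ends) for _ in range(m)]
        for target in targets:
            edges.append((new_vertex, target))
            edge_ends.extend((new_vertex, target))
    return edges


def degrees_from_edges(num_vertices, edges):
    degrees = [0] * num_vertices
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1
    return degrees


rng = random.Random(5)
N, m = 2000, 2
edges = preferential_attachment_graph(N, m, rng)
degrees = degrees_from_edges(N, edges)
```

Even for this modest size the largest degree is far above the average degree of about 2m, the signature of a long tail.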

The most natural solution to the frequent appearance of long tails in networks is to imitate the
behaviour and knowledge available at a vertex in most problems, that is to use only local information
[84, 75, 33, 78, 79, 80, 18]. So again imagine that we are adding a new vertex to an existing network
and we attach this new vertex to $\langle k \rangle / 2$ new edges. The other ends of these new edges we attach to
existing vertices which are found by executing a random walk on the existing network. As we discussed
in section 1.1, under fairly general circumstances the walk will arrive at a vertex with probability
proportional to the number of ways of arriving at that vertex, namely its degree, as expressed in (1). Thus
with the simplest algorithm, i.e. with only local knowledge used, we generate attachment probabilities
proportional to the degree. Hence we find a long tailed power-law distribution.
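
A random walk version of the growth process needs only neighbour lists. The sketch below is one possible reading of the algorithm under stated assumptions: a seed of two vertices joined by m edges, a fresh walk of fixed length started from a uniformly chosen existing vertex for every new edge, an unbiased walk that may step back along the edge it arrived on, and multiple edges allowed. The names and parameter values are illustrative.

```python
import random


def random_walk_growth(num_vertices, edges_per_vertex, walk_length, rng):
    """Grow a graph where each new edge ends at the end point of a short walk."""
    m = edges_per_vertex
    neighbours = [[1] * m, [0] * m]       # seed: two vertices joined by m edges
    for new_vertex in range(2, num_vertices):
        targets = []
        for _edge in range(m):
            current = rng.randrange(new_vertex)       # uniform starting vertex
            for _step in range(walk_length):
                current = rng.choice(neighbours[current])
            targets.append(current)
        neighbours.append([])
        for target in targets:                        # update after all walks
            neighbours[new_vertex].append(target)
            neighbours[target].append(new_vertex)
    return neighbours


N, m = 5000, 2
graph_walk_0 = random_walk_growth(N, m, 0, random.Random(11))   # random attachment
graph_walk_1 = random_walk_growth(N, m, 1, random.Random(11))   # one step walks
max_degree_0 = max(len(nb) for nb in graph_walk_0)
max_degree_1 = max(len(nb) for nb in graph_walk_1)
```

No global quantity such as E enters the walk, yet a single step is already enough to favour high degree vertices compared with purely random attachment (walk length zero).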

One might be concerned that one needs to make a long walk, say of order the average shortest
distance between vertices or of order the graph diameter, before we get effective preferential attachment.
In fact, as Fig. 4 and the more extensive results of [33] show, as long as some of the walks are one step long
then a power-law distribution appears. Walks of two or more steps produce very similar
power laws. What this suggests is that the important length scale is the degree correlation length
and that this is often going to be less than one for many examples.

In a similar way Fig. 5 illustrates that the average degree of the graph makes little difference to the
power law tail provided it is bigger than two. The case of $\langle k \rangle = 2$ produces a tree graph so is a special
case but it actually has an even longer tail.

Figure 5: The degree distributions normalised against the appropriate large network solution $p_\infty(k)$ for
a fixed number of edges $E = 2 \times 10^6$, with one vertex added at each time step (e = 1) but with
the average degree ($\langle k \rangle = 2m$) varied. Plotted against $\log_{10}(k m^{1/2})$ to take account of large scale finite
size effects since here the number of vertices is $N \propto 1/m$. For random walks starting from a random
vertex for every new edge, of fixed length $\ell = 7$ and averaged over 100 runs. Note that the tree graphs
formed when m = 1 (squares) are the only ones showing a strong deviation from the expected cubic
power law, but they still show good power law behaviour with a power of $\gamma \approx 2.0$. Taken from [33].

So the random walk algorithm [75, 33] is extremely robust, producing power laws almost whatever
one does. Different starting points for the walks, varying the length of the walks, or changing the number of
edges added per vertex still gives power-law tails. The value of the power is not, however, universal so
in that sense it does not behave like a critical exponent. Powers can easily vary by 10% or 20% from that
expected from an equivalent Barabási-Albert algorithm. Nevertheless this is a self-organised method in
that the algorithm uses the structure of the graph to generate its own growth and this still drives it to a
power law degree distribution.

2 Copying 

There is another way of looking at the random walk process used to create a growing graph. That is the 
final step of a random walker links the penultimate vertex on its walk to one of its neighbours. When 
we add an edge from our new vertex to the final vertex in the walk, we might imagine that what we are 
actually doing is copying the choice made by the penultimate vertex. Thus if these edges were links 
between web pages (the vertices), we are saying that both the new web page and the existing web page 
think the target of their common edges is a web page worthy of note. Indeed this is the basis of the utility 
of Google's PageRank method which as we have seen is essentially based on a random walk of the web. 
We will now see how this concept of copying can be extended to a wider class of problems. 

2.1 A Simple Model of Cultural Transmission 

Let us focus on the idea of copying and look at a simple model of cultural transmission, the Copying
Model [30, 32, 31, 36, 37]. Suppose we have a fixed population of E individuals, each of whom can
choose one of N 'artifacts'. These artifacts have no intrinsic benefit: they may be the breed of pedigree
dog they own, the shoe style they wear, the name of a baby, or the style of pottery used by that person. At each
time step, one person chosen with probability $\Pi_R$ updates their choice, that is they choose a new artifact
with probability $\Pi_A$. In fact we will focus on individuals chosen at random while their new artifact
choice will be picked in one of two ways. With probability $p_p$ they copy the choice made by an individual
(chosen at random). Alternatively, with probability $p_r$, they pick an artifact at random. In the second case, if there are very large
numbers of artifacts, $N \to \infty$, then this will be the first time this type of artifact has been chosen so we
can think of this process as innovation. Only after both of these choices are made is the actual network
updated. Thus, for an artifact currently chosen by k individuals,

$$ \Pi_R = \frac{k}{E} , \qquad \Pi_A = p_r \frac{1}{N} + p_p \frac{k}{E} , \qquad p_p + p_r = 1 , \qquad (E \geq k \geq 0) . \tag{9} $$

We can represent this model as the rewiring of a bipartite network as shown in Fig. 6. In fact the study of
networks of constant size has received relatively little attention despite the fact that many systems will 
eventually reach a constant size or at least change size only slowly. 
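
One rewiring event of the Copying Model can be sketched directly from this description. The code is a minimal illustration under the assumption that the individual who updates is chosen uniformly, and that the new artifact is found by copying a uniformly chosen individual with probability $1 - p_r$ or by picking a uniformly chosen artifact with probability $p_r$; both choices are made before the update. Names and parameter values are illustrative.

```python
import random
from collections import Counter


def copying_model_step(choices, num_artifacts, p_r, rng):
    """Rewire the artifact end of one individual's edge, in place."""
    individual = rng.randrange(len(choices))          # who updates their choice
    if rng.random() < p_r:
        new_artifact = rng.randrange(num_artifacts)   # innovation: random artifact
    else:
        new_artifact = choices[rng.randrange(len(choices))]   # copy an individual
    choices[individual] = new_artifact                # update after both choices


def artifact_degrees(choices, num_artifacts):
    """Degree k of every artifact: the number of individuals choosing it."""
    counts = Counter(choices)
    return [counts[artifact] for artifact in range(num_artifacts)]


rng = random.Random(3)
E, N, p_r = 100, 100, 0.01
choices = [i % N for i in range(E)]                   # choices[i] = artifact of i
for _ in range(20 * E):
    copying_model_step(choices, N, p_r, rng)
degrees = artifact_degrees(choices, N)

consensus = [0] * 10                                  # everyone starts on artifact 0
for _ in range(100):
    copying_model_step(consensus, 5, 0.0, rng)        # pure copying
```

Each step moves exactly one edge end, so the numbers of individuals, artifacts and edges never change, and with pure copying a complete consensus can never be broken.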



Figure 6: The Copying Model as a bipartite graph has E 'individual' vertices, each with one edge. The
other end of the edge is connected to one of N 'artifact' vertices. If the degree of an artifact vertex
is k then this artifact has been 'chosen' by k distinct individuals. At each time step a single rewiring
of the artifact end of one edge occurs. An individual is chosen (number 3 here) with probability $\Pi_R$
which gives us the source artifact (here B). At the same time the target artifact is chosen with probability
$\Pi_A$ (here labelled A). After both choices have been made the rewiring is performed (here individual 3
switches its edge from artifact B to A).

This bipartite graph may seem to be a trivial network but a projection onto a graph of just the artifact
vertices, see Fig. 7, is just an implementation of the Molloy-Reed projection [60]. Thus the Copying
Model captures the degree distribution of a fixed size unipartite graph undergoing rewiring, which has
been studied elsewhere in several ways [23, 26, 68, 87, 64, 65, 66].



Figure 7: A Molloy and Reed [60] projection of the copying model bipartite graph onto an undirected
unipartite graph with N vertices and E/2 edges. Here individual vertices numbered (2i) and (2i - 1) are paired to give the unipartite
graph edge labelled (2i - 1, 2i). The rewiring event shown is the equivalent to that shown in Fig. 6 for
the bipartite graph.

This Copying Model may seem naive but it has been used on several different data sets: transmission
of cultural artifacts such as pottery designs, dog breed and baby name popularity [9, 43, 45, 8, 10].
The copying and innovation are familiar in other contexts, as inheritance and mutation in the diversity
of genes or species [19, 55, 53, 52, 58, 46], or as inheritance and the effect of New
Immigrants on the distribution of family names in constant populations [90]. One may also relate this to
models of language evolution [83] and to variants of the Minority Game [2].

There is also a close relationship between this copying model and other models of statistical
physics. It can be translated into the language of Urn models [42, 65, 66] as shown in Fig. 8
and is related to some variations of the Backgammon or Balls-In-Box models used for glasses [74, 11],
simplicial gravity [12] and wealth distributions [16]. The closest zero range process [28, 29, 73] to the
copying model discussed here is the variant with a 'misanthrope' process on a fully connected geometry.
Voter Models [57, 81], when played on complete graphs, are just the N = 2 limit of the case considered
here.




Figure 8: The Urn model representation of the Copying Model, with the artifacts drawn as N urns. The
rewiring event shown is the equivalent to that shown in Fig. 6 for the bipartite graph.



2.2 Mean Degree Distribution 

The mean field approximation is very accurate for many models because of the low vertex correlations. 
However can the mean field equation ever be exact? The answer is yes but only for special attachment 
and removal probabilities. 

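The evolution of the degree distribution is given by

$$ \begin{aligned}
n(k,t+1) - n(k,t) = {} & n(k+1,t) \, \Pi_R(k+1,t) \left( 1 - \Pi_A(k+1,t) \right) \\
& - n(k,t) \, \Pi_R(k,t) \left( 1 - \Pi_A(k,t) \right) \\
& - n(k,t) \, \Pi_A(k,t) \left( 1 - \Pi_R(k,t) \right) \\
& + n(k-1,t) \, \Pi_A(k-1,t) \left( 1 - \Pi_R(k-1,t) \right) , \qquad (E \geq k \geq 0) .
\end{aligned} \tag{10} $$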

Note that the factors of $(1 - \Pi)$ are invariably ignored in the literature yet they are essential if we are to
enforce the boundary condition that n(k) = 0 if k > E. These $(1 - \Pi)$ terms take account of processes
where the edge is removed and then reattached to the same artifact.
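
The master equation can be iterated numerically as a check on its structure. The sketch below assumes the copying and innovation forms $\Pi_R = k/E$ and $\Pi_A = p_r/N + p_p k/E$ and keeps the $(1 - \Pi)$ factors; with them no weight can leak above k = E, and both the number of artifacts and the number of edges are conserved. Names and parameter values are illustrative.

```python
def master_equation_step(n, num_artifacts, p_r):
    """Advance the mean degree distribution n[0..E] by one rewiring event."""
    E = len(n) - 1
    p_p = 1.0 - p_r

    def pi_R(k):                      # removal from an artifact of degree k
        return k / E

    def pi_A(k):                      # attachment to an artifact of degree k
        return p_r / num_artifacts + p_p * k / E

    new_n = []
    for k in range(E + 1):
        gain_above = n[k + 1] * pi_R(k + 1) * (1 - pi_A(k + 1)) if k < E else 0.0
        gain_below = n[k - 1] * pi_A(k - 1) * (1 - pi_R(k - 1)) if k > 0 else 0.0
        loss = n[k] * (pi_R(k) * (1 - pi_A(k)) + pi_A(k) * (1 - pi_R(k)))
        new_n.append(n[k] + gain_above + gain_below - loss)
    return new_n


E, N, p_r = 20, 20, 0.1
n = [0.0] * (E + 1)
n[1] = float(N)                       # initially every artifact is chosen once
for _ in range(2000):
    n = master_equation_step(n, N, p_r)
total_artifacts = sum(n)
total_edges = sum(k * n_k for k, n_k in enumerate(n))
```

At k = E the attachment term vanishes because $1 - \Pi_R(E) = 0$, which is the boundary condition that is lost if the $(1 - \Pi)$ factors are dropped.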

It is implicit that we are taking an ensemble average over many runs of our system. Thus problems
arise when we deal with the normalisations of our probabilities. For instance if we have attachment
or removal probabilities of the form $(k^\beta / z_\beta)$ then the normalisation $z_\beta$ depends on the particular
configuration n(k) of each contribution to the ensemble. That is, in general we cannot factorise as needed:

$$ \left\langle \frac{k^\beta}{z_\beta} n(k) \right\rangle \neq \frac{\langle k^\beta n(k) \rangle}{\langle z_\beta \rangle} . \tag{11} $$

The only two cases where the mean field approximation is exact, where we have equality in (11), are
when $\beta = 0$ or $\beta = 1$. This is because then the normalisations are invariants of the system, N and E
respectively. The most general choice for $\Pi_R$ and $\Pi_A$ satisfying these criteria is the simple copying and
innovation probabilities of (9) and we will now restrict ourselves to this case.

It turns out that one can then solve for the mean degree distribution exactly for any parameter value
and any time. The best way is through the generating function G(z, t)

$$ G(z,t) := \sum_{k=0}^{E} z^k n(k,t) \tag{12} $$

which is like taking a discrete Mellin transform. The degree distribution and moments are then simple
derivatives of the generating function

$$ n(k,t) = \frac{1}{k!} \left. \frac{d^k G(z,t)}{dz^k} \right|_{z=0} , \tag{13} $$

$$ \langle k^n \rangle = \frac{1}{N} \left. \frac{d^n G(e^x,t)}{dx^n} \right|_{x=0} = \frac{1}{N} \left. \left( z \frac{d}{dz} \right)^n G(z,t) \right|_{z=1} . \tag{14} $$

This turns our equation (10) into a differential equation for the generating function,

$$ \frac{b(1 + a - c)}{(1 - z)} \left[ G(z,t+1) - G(z,t) \right] = z(1 - z) G''(z,t) + [c - (a + b + 1)z] G'(z,t) - ab \, G(z,t) \tag{15} $$

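where the G' and G'' are single and double derivatives with respect to z. The constants a, b and c are
given by

$$ a = \frac{p_r}{p_p} \langle k \rangle , \qquad b = -E , \qquad c = 1 + \frac{p_r}{p_p} \langle k \rangle - \frac{E}{p_p} . \tag{16} $$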



We can exploit the linearity by splitting the generating function into (E + 1) eigenfunctions $G^{(m)}(z)$
and eigenvalues $\lambda_m$ (m = 0, 1, ..., E):

$$ G(z,t) = \sum_{m=0}^{E} c_m (\lambda_m)^t G^{(m)}(z) , \qquad G^{(m)}(z) := \sum_{k=0}^{E} z^k \omega^{(m)}(k) . \tag{17} $$

The initial conditions n(k, t = 0) fix the coefficients $c_m$. The eigenfunctions satisfy

$$ z(1 - z) G^{(m)\prime\prime}(z) + [c - (a + b + 1)z] G^{(m)\prime}(z) - \left[ ab - \frac{(\lambda_m - 1)}{(1 - z)} b(c - a - 1) \right] G^{(m)}(z) = 0 . \tag{18} $$



Recognising that this equation is similar to the hypergeometric ODE we obtain our solution in terms of
the hypergeometric functions $F \equiv {}_2F_1$ where

$$ G^{(m)}(z) = (1 - z)^m F(a + m, b + m; c; z) = (1 - z)^m \sum_{l=0}^{E-m} \frac{\Gamma(a + m + l) \, \Gamma(b + m + l) \, \Gamma(c)}{\Gamma(a + m) \, \Gamma(b + m) \, \Gamma(c + l) \, l!} z^l \tag{19} $$

with corresponding eigenvalues,

$$ \lambda_m = 1 - m(m - 1) \frac{p_p}{E^2} - m \frac{p_r}{E} , \qquad 0 \leq m \leq E . \tag{20} $$

The eigenvalues satisfy $\lambda_m > \lambda_{m+1}$ except for $p_r = 0$ when $\lambda_0 = \lambda_1 = 1$.



2.3 Exact Equilibrium Solution 



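The properties of the hypergeometric function give us the mean degree distribution as a simple ratio of
$\Gamma$ functions:

$$ n(k) = A \, \frac{\Gamma\left( k + \frac{p_r}{p_p} \langle k \rangle \right)}{\Gamma(k + 1)} \, \frac{\Gamma\left( \frac{E}{p_p} - \frac{p_r}{p_p} \langle k \rangle - k \right)}{\Gamma(E + 1 - k)} , \tag{21} $$

$$ A := N \, \frac{\Gamma(E + 1) \, \Gamma\left( \frac{p_r}{p_p} E \right)}{\Gamma\left( \frac{p_r}{p_p} (E - \langle k \rangle) \right) \Gamma\left( \frac{p_r}{p_p} \langle k \rangle \right) \Gamma\left( \frac{E}{p_p} \right)} . \tag{22} $$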



This is similar to the long time solution for growing networks [49, 50, 6, 24, 17, 48, 51] but the second
fraction in (21) is only found for network rewiring with the correct master equation, i.e. when the factors
of $(1 - \Pi)$ are included in (10). Only approximate solutions were known previous to [30].
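
The equilibrium form can be evaluated and checked numerically. The sketch below assumes the solution is the product of two ratios of $\Gamma$ functions with parameters $\alpha = (p_r/p_p) \langle k \rangle$ and $\beta = (p_r/p_p)(E - \langle k \rangle)$, i.e. $n(k) \propto \Gamma(k + \alpha) \Gamma(E - k + \beta) / [\Gamma(k + 1) \Gamma(E + 1 - k)]$, and tests that it is normalised to N artifacts and E edges and that it balances the flow between neighbouring degrees for $\Pi_R = k/E$ and $\Pi_A = p_r/N + p_p k/E$. It requires $0 < p_r < 1$ and N > 1; names are illustrative.

```python
import math


def equilibrium_distribution(E, N, p_r):
    """Equilibrium n(k), k = 0..E, built from log-Gamma functions."""
    ratio = p_r / (1.0 - p_r)
    alpha = ratio * E / N                 # (p_r/p_p) <k>
    beta = ratio * (E - E / N)            # (p_r/p_p) (E - <k>)
    log_A = (math.log(N) + math.lgamma(E + 1) + math.lgamma(alpha + beta)
             - math.lgamma(alpha) - math.lgamma(beta)
             - math.lgamma(E + alpha + beta))
    return [math.exp(log_A
                     + math.lgamma(k + alpha) - math.lgamma(k + 1)
                     + math.lgamma(E - k + beta) - math.lgamma(E + 1 - k))
            for k in range(E + 1)]


def largest_imbalance(n, N, p_r):
    """Largest difference between upward and downward flow for any k."""
    E = len(n) - 1
    p_p = 1.0 - p_r
    worst = 0.0
    for k in range(E):
        up = n[k] * (p_r / N + p_p * k / E) * (1 - k / E)
        down = n[k + 1] * ((k + 1) / E) * (1 - p_r / N - p_p * (k + 1) / E)
        worst = max(worst, abs(up - down))
    return worst


E, N, p_r = 100, 100, 0.01
n_eq = equilibrium_distribution(E, N, p_r)
imbalance = largest_imbalance(n_eq, N, p_r)
```

For these values $\beta = 1$, so the second ratio of $\Gamma$ functions is constant and n(k) is very nearly a pure inverse power law in k.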




Figure 9: The degree probability distribution function p(k) = n(k)/N for N = E = 100 and various
$p_r$. From top to bottom at k = 2 we have: $p_r = 1$ (red crosses), $p_r = 10/E$ (green circles), $p_r = 1/E$
(blue stars) and $p_r = 0.1/E$ (magenta squares). For $p_r = 1/E$ the distribution is almost a pure inverse
power law for all values of k. With $p_r = 1$ we have a binomial distribution. Taken from [36].

For large degree the equilibrium behaviour splits into three regimes. With a reasonable amount of
innovation, $E^{-1} < p_r < (1 + \langle k \rangle)^{-1}$, the degree distribution is a power law with an exponential cutoff

$$ n(k) \propto k^{-\gamma} e^{-\zeta k} , \qquad \gamma = 1 - \frac{p_r}{(1 - p_r)} \langle k \rangle < 1 , \qquad \zeta = -\ln(1 - p_r) . \tag{23} $$

The slope $\gamma$ will be indistinguishable from one in data sets since if $\gamma \ll 1$ then the exponential cutoff scale
is too small. This type of solution is characteristic of a simple copying process in a network of
fixed size so it is not surprising to see it appearing in other apparently more complex systems which
have copying as part of their fundamental dynamics. For instance, power laws with exponent one appear in network
models of species [52, 38] when the networks are of constant size, at least on average in the long time
limit. In these models the copying and innovation processes are inheritance and mutation. Another
example, this time from sociophysics [2], will be discussed below.

The second region is where randomness dominates, $1 \geq p_r > (1 + \langle k \rangle)^{-1}$. The
degree distribution starts to look more like the binomial distribution which is the limit at $p_r = 1$.

The last region appears when in one generation, the time taken to rewire most of the edges once, the
edges are likely to be assigned using only the copying process. This occurs when $E^{-1} \gg p_r \geq 0$. Now
the distribution turns up at k = E and we find almost all individuals are attached to a single artifact:
we have a condensate or fixation. Again this is due solely to the second ratio of $\Gamma$ functions in (21)
which is present only if the factors of $(1 - \Pi)$ are included in (10).

Only in the $E \to \infty$ limit is the transition between condensate and non-condensate regimes a phase
transition and this occurs at $p_r = 0$.

A special case of interest is when we look at just two artifacts, $N = 2$, so that $\langle k \rangle$ is as
large as possible. With $p_r = 0$ we obtain the basic Voter model (see for example [58, 8, 10]), which has
been used as a simple model of language evolution [53]. One question asked is the time for the model to come
to a complete consensus, i.e. all 'voters' have made the same choice, a condensate in our language. Our
results show that the time scale is set by $\tau_2 := -1/\ln(\lambda_2)$. A little randomness,
$0 < p_r < E^{-1}$, leaves the consensus imperfect but still largely intact for very long periods of time.
This consensus will still take $O(E^2)$ rewirings to appear. However, for
$p_* = (1 + E/2)^{-1} \gg p_r > E^{-1}$ we still get most voters choosing the same option, but the time
scale for the equilibrium to be reached drops towards $O(E)$ as $p_r$ is raised. Finally there is a
transition at $p_*$, a $Z_2$ symmetry breaking transition, to a region at large $p_r$ where there is no
special consensus. An example of some mean degree distributions in the Voter model with randomness added is
shown in Fig. 10. These results are easily generalised to other large $\langle k \rangle$ cases of the
Copying Model, e.g. we find that in general $p_* = (E + 1 - \langle k \rangle)^{-1}$.
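
As a rough numerical illustration of these regimes, the short Python sketch below evaluates the consensus time scale and the transition point. It assumes that the second eigenvalue takes the form $\lambda_2 = 1 - 2p_r/E - 2(1 - p_r)/E^2$, which is what a single random rewiring per step implies for the probability that two individuals agree but which is not restated in this section, and it uses $p_* = (E + 1 - \langle k \rangle)^{-1}$, which reduces to the $N = 2$ value $(1 + E/2)^{-1}$ quoted above. The function names are purely illustrative.

```python
import math


def lambda_2(E: int, p_r: float) -> float:
    """Assumed m = 2 eigenvalue for one random rewiring per time step."""
    p_p = 1.0 - p_r  # copying probability
    return 1.0 - 2.0 * p_r / E - 2.0 * p_p / E**2


def tau_2(E: int, p_r: float) -> float:
    """Equilibration time scale tau_2 = -1 / ln(lambda_2), in rewirings."""
    return -1.0 / math.log(lambda_2(E, p_r))


def p_star(E: int, N: int) -> float:
    """Transition point p_* = 1 / (E + 1 - <k>), with <k> = E / N."""
    return 1.0 / (E + 1.0 - E / N)


if __name__ == "__main__":
    E = 100
    for p_r in (0.0, 0.001, 0.01, 0.1):
        print(f"p_r = {p_r:<6} tau_2 = {tau_2(E, p_r):9.1f} rewirings")
    print(f"p_* for N = 2: {p_star(E, 2):.4f}")
```

Under these assumptions, with $E = 100$ the time scale falls from roughly $E^2/2$ rewirings at $p_r = 0$ to a few multiples of $E$ once $p_r$ approaches 0.1, and $p_* = 1/51$ lies close to the value $p_r = 0.019$ used for the flat curve in Fig. 10.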




[Plot: mean degree distribution against k (degree), for k from 0 to 100]

Figure 10: Mean degree distribution for the Voter model for $E = 100$ and various $p_r$. The blue curve,
with the highest values at $k = 0$ and $k = 100$, is for $p_r = 0.001 \ll E^{-1}$ and shows a large
consensus or condensate. The horizontal black line is for $p_r = 0.019 \approx p_*$, and here there is a
$Z_2$ transition between graphs with a minimum at $k = E/2$ and those with a maximum. The green curve, with
the lowest values at $k = 0$ and $k = 100$, is for $p_r = 0.99$, well in the regime where there is no
consensus. Note that for $N = 2$ there is a symmetry $p(k) = p(E - k)$.
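
The three shapes described for Fig. 10 can be checked with a small calculation. The sketch below is not the exact solution of the text: it simply treats the $N = 2$ model as a birth-death chain for the degree $k$ of one artifact and solves detailed balance. It assumes that an edge is removed from an artifact with probability $k/E$ and attached to an artifact of degree $k$ with probability $p_r/2 + (1 - p_r)k/E$, both evaluated before the move; the function name is hypothetical.

```python
def voter_stationary(E: int, p_r: float) -> list:
    """Equilibrium p(k), k = 0..E, for N = 2 artifacts via detailed balance.

    Assumes removal probability k/E and attachment probability
    p_r/2 + (1 - p_r) k/E, both evaluated before the edge is moved.
    """
    if not 0.0 < p_r <= 1.0:
        raise ValueError("p_r must lie in (0, 1] for a unique equilibrium")
    p_p = 1.0 - p_r
    weights = [1.0]
    for k in range(E):
        # Rate for k -> k + 1: take an edge from the other artifact, give it to this one.
        up = (E - k) / E * (p_r / 2 + p_p * k / E)
        # Rate for k + 1 -> k: take an edge from this artifact, give it to the other.
        down = (k + 1) / E * (p_r / 2 + p_p * (E - k - 1) / E)
        weights.append(weights[-1] * up / down)
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    E = 100
    for p_r in (0.001, 1 / 51, 0.99):
        p = voter_stationary(E, p_r)
        print(f"p_r = {p_r:.4f}: p(0) = {p[0]:.4g}, p(E/2) = {p[E // 2]:.4g}")
```

With these assumptions the distribution is exactly flat at $p_r = (1 + E/2)^{-1}$, peaked at $k = 0$ and $k = E$ below that value and peaked at $k = E/2$ above it, in line with the description of the three curves.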



2.4 General Features of the Exact Solution 

The exact solution has several interesting properties. The eigenfunction numbered zero is the only one
which is time independent, $\lambda_0 = 1$, so this eigenfunction corresponds to the unique equilibrium
solution. The eigenfunction numbered one never contributes. The first moment is constant, as it is related
to the ratio $\langle k \rangle = E/N$, yet it depends only on the eigenfunctions zero and one. Since the
latter ($\lambda_1 \neq 1$) gives a time dependence, its contribution must be zero, $c_1 = 0$. Thus the
slowest time dependence comes from the $m = 2$ eigenfunction, setting the equilibration time scale to be

$$ \tau_2 = -\frac{1}{\ln(\lambda_2)} \approx \ldots \qquad (24) $$

The best way to study the time dependence is not to look at the moments but to study the Homogeneity
Measures $F_n$,

$$ F_n(t) := \frac{\Gamma(E+1-n)}{\Gamma(E+1)} \left. \frac{d^n G(z,t)}{dz^n} \right|_{z=1} \qquad (25) $$

$$ = \sum_{k=0}^{E} \frac{k(k-1)\cdots(k-n+1)}{E(E-1)\cdots(E-n+1)} \, n(k,t) . \qquad (26) $$


Trivially, for $n > E$ we have $F_n = 0$ but $\langle k^n \rangle \neq 0$, highlighting one simplification
over the moments. It is also clear from the $(z - 1)^m$ prefactor of the $m$-th eigenfunction (19) that the
$n$-th homogeneity measure gets contributions only from eigenfunctions numbered $n$ and lower. The moments
have a similar property, as can be seen from the relationship between the $m$-th moments and the $n$-th
Homogeneity Measures:

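$$ \frac{\Gamma(E+1)}{\Gamma(E+1-n)} F_n = N \sum_{m=0}^{n} S_n^{(m)} \langle k^m \rangle , \qquad N \langle k^n \rangle = \sum_{m=0}^{n} \mathfrak{S}_n^{(m)} \frac{\Gamma(E+1)}{\Gamma(E+1-m)} F_m , \qquad (27) $$

where $S_n^{(m)}$ and $\mathfrak{S}_n^{(m)}$ are Stirling numbers of the first and second kind
respectively. The generating function may now be written as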

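$$ G(1+y,t) = \sum_{n=0}^{E} y^n \binom{E}{n} F_n(t) = \sum_{k=0}^{E} (1+y)^k \, n(k,t) , \qquad (28) $$

$$ F_n(t) = \frac{\Gamma(E+1-n)}{\Gamma(E+1)} \left. \frac{d^n G(z,t)}{dz^n} \right|_{z=1} , \qquad (29) $$

i.e. up to the binomial factor $\binom{E}{n}$, the $F_n$ are the $n$-th coefficients of the Taylor
expansion of $G$ around $z = 1 + y = 1$.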
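
The definition of the homogeneity measures and the expansion of the generating function can be checked directly on a small example. The following Python sketch computes $F_n$ from a list of artifact counts $n(k)$ and compares the two sides of the Taylor expansion around $z = 1$. The helper names and the example distribution are hypothetical and chosen only for illustration.

```python
from math import comb, perm


def total_edges(n_k: list) -> int:
    """E = sum_k k n(k), where n_k[k] is the number of artifacts of degree k."""
    return sum(k * count for k, count in enumerate(n_k))


def homogeneity(n_k: list, n: int) -> float:
    """F_n = sum_k [k(k-1)...(k-n+1)] n(k) / [E(E-1)...(E-n+1)]."""
    E = total_edges(n_k)
    if n > E:
        return 0.0
    # perm(k, n) is the falling factorial k(k-1)...(k-n+1) and is 0 for n > k.
    return sum(perm(k, n) * count for k, count in enumerate(n_k)) / perm(E, n)


def G_direct(n_k: list, y: float) -> float:
    """Generating function evaluated as sum_k (1 + y)^k n(k)."""
    return sum(count * (1.0 + y) ** k for k, count in enumerate(n_k))


def G_from_F(n_k: list, y: float) -> float:
    """Generating function evaluated as sum_n y^n binom(E, n) F_n."""
    E = total_edges(n_k)
    return sum(y**n * comb(E, n) * homogeneity(n_k, n) for n in range(E + 1))


if __name__ == "__main__":
    # One artifact of degree 0, two of degree 1 and one of degree 3 (N = 4, E = 5).
    n_k = [1, 2, 0, 1]
    print("F_2 =", homogeneity(n_k, 2), " F_3 =", homogeneity(n_k, 3))
    print("G(1.5):", G_direct(n_k, 0.5), "=", G_from_F(n_k, 0.5))
```

Note that with this normalisation $F_0 = N$ and $F_1 = 1$, so only the measures with $n \geq 2$ carry information about the distribution.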

Unlike the moments, the homogeneity measures have a simple physical interpretation, as they are
the probability that any $n$ different individuals will have chosen the same artifact. Thus if for all
$1 < n \leq E$ we have $F_n = 0$ then no artifact has been chosen more than once, while if all $F_n = 1$
then all individuals are attached to the same artifact, a condensate. For instance the simplest measure of
homogeneity of the system is $F_2$, the probability that two different individuals have chosen the same
artifact. This is given by

$$ F_2(t) = F_2(\infty) + (\lambda_2)^t \left( F_2(0) - F_2(\infty) \right) , \qquad (30) $$

$$ F_2(\infty) = \frac{1 + p_r (\langle k \rangle - 1)}{1 + p_r (E - 1)} , \qquad (31) $$

where the initial conditions set $F_2(0)$. The accuracy of the full time dependence of our solutions can be
seen by following these $F_n$ measures, as shown in Fig. 11.
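
The exponential relaxation of $F_2$ can be reproduced from a one-step argument. The sketch below iterates the change in $F_2$ caused by a single random rewiring and compares it with the closed form of (30) and (31). It is written under two assumptions that are not spelled out in this section: the individual being copied is drawn uniformly from all $E$ individuals (including the one being updated), and $\lambda_2 = 1 - 2p_r/E - 2(1 - p_r)/E^2$. All function names are illustrative.

```python
def f2_infinity(E: int, N: int, p_r: float) -> float:
    """Equilibrium value of F_2 from Eq. (31), with <k> = E / N."""
    return (1.0 + p_r * (E / N - 1.0)) / (1.0 + p_r * (E - 1.0))


def f2_step(F2: float, E: int, N: int, p_r: float) -> float:
    """Expected F_2 after one random rewiring (one individual updated)."""
    p_p = 1.0 - p_r
    # Probability that the updated individual now agrees with another given
    # individual: innovate onto the same artifact (1/N), copy that individual
    # directly (1/E), or copy anyone else, who agrees with probability F_2.
    same = p_r / N + p_p * (1.0 / E + (E - 1.0) / E * F2)
    # A fraction 2/E of all pairs contains the updated individual.
    return (1.0 - 2.0 / E) * F2 + (2.0 / E) * same


def f2_closed(t: int, F2_0: float, E: int, N: int, p_r: float) -> float:
    """Closed form of Eq. (30) with the assumed eigenvalue lambda_2."""
    lam2 = 1.0 - 2.0 * p_r / E - 2.0 * (1.0 - p_r) / E**2
    F_inf = f2_infinity(E, N, p_r)
    return F_inf + lam2**t * (F2_0 - F_inf)


if __name__ == "__main__":
    E = N = 100
    p_r = 0.01
    F2 = 0.0
    for t in range(1, 2001):
        F2 = f2_step(F2, E, N, p_r)
    print("iterated:", F2, " closed form:", f2_closed(2000, 0.0, E, N, p_r))
    print("equilibrium:", f2_infinity(E, N, p_r))
```

For $E = N = 100$ and $p_r = 0.01$ the equilibrium value is $1/1.99$, i.e. close to one half.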

2.5 Following a phase transition in real time 

We have already noted that our model gives the time evolution of the degree distribution of a generalised
random graph made up of the artifact vertices, as shown in Fig. 7. This graph undergoes a phase
transition (e.g. the appearance of a GCC, a Giant Connected Component) [63, 25, 39, 60] at $z(t) = 1$, where
$z$ was defined in (2). This is simply related to $F_2(t)$ as

$$ z(t) := \frac{\langle k^2 \rangle}{\langle k \rangle} - 1 = (E - 1) F_2(t) . \qquad (32) $$

Thus we now have an analytic handle on the phase transition which occurs when rewiring a unipartite
graph [36], shown in Fig. 12. In principle we can calculate the number of vertices in the GCC, the
diameter and the average shortest path length in the GCC from known formulae, and these only require
knowledge of $F_2(t)$.

[Plot: two panels, each with horizontal axis 'rewirings ×10^4']

Figure 11: Results for various $F_n(t)$ from numerical runs (points), with the lines given by the exact
formula. From top to bottom are: $F_2(t)$ (crosses), $F_3(t)$ (circles), $F_4(t)$ (stars). Here
$E = N = 100$, $p_r = 0.01$ and the data points are the average of $10^5$ runs of a simulation. Taken from [31].




Figure 12: Behaviour of a unipartite graph as a randomly chosen end of a random edge is rewired at each
time step and reattached to a vertex chosen with pure preferential attachment, $p_r = 0$. Here the graph has
$N = E = 10^5$ and starts from $F_2(0) = 0$, i.e. all vertices are of degree one and are connected to one
other vertex. Results are calculated for each instance and then averaged over a total of 1000 runs. Note
that finite size effects are clearly visible, as the transition is not perfectly sharp and it occurs at
$z = 1.06 \pm 0.01$. Taken from [36].
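
Since $z(t) = (E - 1) F_2(t)$ and $F_2(t)$ relaxes exponentially, the number of rewirings needed to reach $z = 1$ can be estimated in closed form. The sketch below does this under the same assumed eigenvalue $\lambda_2 = 1 - 2p_r/E - 2(1 - p_r)/E^2$ used for single random rewirings, with the equilibrium value taken from (31); it requires $E > 1$. The function name is hypothetical, and identifying its parameters with those of the unipartite graph in Fig. 12 is itself an assumption.

```python
import math


def critical_rewirings(E: int, N: int, p_r: float = 0.0, F2_0: float = 0.0) -> float:
    """Rewirings t_c at which z(t) = (E - 1) F_2(t) first reaches 1.

    Returns 0.0 if the graph starts at or beyond the transition and
    math.inf if the equilibrium lies below it.
    """
    p_p = 1.0 - p_r
    # ln(lambda_2), using log1p for accuracy when E is large.
    log_lam2 = math.log1p(-(2.0 * p_r / E + 2.0 * p_p / E**2))
    F_inf = (1.0 + p_r * (E / N - 1.0)) / (1.0 + p_r * (E - 1.0))
    target = 1.0 / (E - 1.0)  # value of F_2 at which z = 1
    if F2_0 >= target:
        return 0.0
    if F_inf <= target:
        return math.inf
    # Solve F_inf + lambda_2^t (F2_0 - F_inf) = target for t.
    ratio_minus_one = -(target - F2_0) / (F_inf - F2_0)
    return math.log1p(ratio_minus_one) / log_lam2


if __name__ == "__main__":
    E = N = 10**5
    print(f"t_c = {critical_rewirings(E, N):.1f} rewirings (about E/2)")
```

With $E = N = 10^5$, $p_r = 0$ and $F_2(0) = 0$ this estimate gives a transition after roughly $E/2$ rewirings; it says nothing about the finite size shift in $z$ noted in the caption of Fig. 12.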



2.6 Adding a Network of Individuals 

So far in this model we have inserted the copying process by hand, just demanding that the attachment
probabilities $\Pi_A$ have a term proportional to $k$. However, in practice we want to see this emerge as a
natural process involving only local information. Again the normalisations play a key role since they
contain global variables. However it is simple to use the same random walk idea of section 1.3 to
generate the copying process in this model. To do this we now add an Individual graph, that is a network
with edges between just the individual vertices, as shown in Fig. 13.
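
The way a short walk on the Individual graph generates the copying probabilities can be made concrete with a small exact calculation. The Python sketch below returns the distribution of the target artifact for a one-step walk from a given individual followed by a step to that neighbour's artifact. The data layout (neighbour lists, with a tadpole represented by an individual appearing in its own list) and all names are assumptions made for illustration.

```python
from collections import defaultdict


def copy_target_distribution(neighbours: list, choice: list, start: int) -> dict:
    """Target artifact distribution for a one-step walk starting at `start`.

    neighbours[i] lists the individuals adjacent to individual i and
    choice[j] is the artifact currently chosen by individual j.
    """
    dist = defaultdict(float)
    step = 1.0 / len(neighbours[start])  # each neighbour is equally likely
    for j in neighbours[start]:
        dist[choice[j]] += step  # final step: from individual j to its artifact
    return dict(dist)


if __name__ == "__main__":
    choice = ["A", "A", "A", "B", "B", "C"]  # degrees k_A = 3, k_B = 2, k_C = 1
    E = len(choice)
    complete = [list(range(E)) for _ in range(E)]  # complete graph with tadpoles
    ring = [[(i - 1) % E, (i + 1) % E] for i in range(E)]  # one-dimensional lattice
    print("complete:", copy_target_distribution(complete, choice, 3))
    print("ring:    ", copy_target_distribution(ring, choice, 3))
```

On the complete graph with tadpoles every starting point gives a target probability of exactly $k/E$, so the global normalisation appears without any individual needing to know it. On the ring the same walk only sees the two neighbours' choices, which is one way to picture why the Individual graph can matter for how quickly choices spread.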

The results suggest that in most cases the individual network has little effect on the equilibrium
degree distribution [36], as shown in Fig. 14.

Figure 13: The bipartite network of the Copying Model with an individual network added (shown below
the individual vertices) and used to generate the copying mechanism. The rewiring event is the same
as that shown in Fig. ©, here assumed to be a copying event. The target artifact is now found by first
making a random walk on the individual network, followed by a final step on the bipartite graph to reach
the target artifact. In the case shown, the walk starts from the randomly chosen individual 3 (its
chosen artifact B is the source artifact) and we make a one step random walk on the individual graph to
arrive at individual 1. We finish by walking from individual 1 to its chosen artifact A, which becomes
the target artifact. Thus the event consists of individual 3 copying the current choice of individual 1.




[Plot: equilibrium artifact degree distribution p(k) against k]

Figure 14: Equilibrium artifact degree distribution $p(k)$ for different Individual graphs of 100 vertices
and average degree 4: Erdős-Rényi (red pluses), Exponential (green circles), Barabási-Albert (purple
squares), periodic lattices of two (grey crosses) and one (blue diamonds) dimension. The line is the
analytic result for a complete Individual graph, while the other results are taken over an ensemble of
$10^4$ Individual graphs. $N = E = 100$, $p_r = 1/E$. From [36].

However, when we look at the time dependence we see more sensitivity to the properties of the
individual graph. We can now define a local measure of homogeneity, the average interface density,
$\langle \rho \rangle$. This is the probability that any two individual vertices which are connected by the
Individual graph have a different artifact. As Fig. 15 shows, the local and global homogeneity measures
$\langle \rho \rangle$ and $(1 - F_2)$ are close to the analytic result² for large dimension lattices with
short network distances. As we take lattices of smaller dimension, $F_2$ gets much larger than the analytic
result, and $\langle \rho \rangle$ much smaller. Similar effects can be seen as we change $p_r$ [36].

¹ The lattices used in Figs. 14 and 15 are periodic and cubic ($Z^d$) with the nearest neighbours
connected, except for the one-dimensional case where next-to-nearest neighbours are also connected. The
Exponential and Barabási-Albert graphs are connected Individual graphs with degree distributions of
$p_{\rm ind}(k) \propto \exp(-\zeta k)$ and $p_{\rm ind}(k) \propto [k(k+1)(k+2)]^{-1}$ respectively.




[Plot: horizontal axis t/E on a logarithmic scale from 10^0 to 10^4]

Figure 15: Homogeneity measure $F_2$ for various cubic lattices against $t/E$. The black solid line
represents the analytic $1 - F_2(t)$ for $N = 2$, $p_r = 1/E$ and $E = 729$. Numerical results for
$1 - F_2(t)$ (triangle highlights) are plotted for 1-d (red), 2-d (purple) and 3-d (blue) regular lattices.
The average interface densities $\langle \rho \rangle$ are plotted as circles. Averaged over 1000 runs.
From [36].

These results are of relevance to many sociophysics models. Consider a variation of the Minority
game [2] in which individuals follow either their own strategy or that of a neighbour. Then the number
of 'actors' (followers) using one particular strategy (that belonging to a leader) can be understood in
terms of the Copying Model, as this is the mean degree distribution with the strategies playing the roles
of artifacts. Actors are copying their strategy or, given the inherent instability of the game, they
flounder about in a way that is statistically indistinguishable from a random (innovation) process. It
should be no surprise that this distribution is found to be [2] a power law with slope one and some cutoff,
in agreement with (23). Again this model emphasises the way in which copying is a natural process from
which preferential attachment, and thus power law degree distributions, emerge naturally, much as was
noted for growing networks in [75, 33].

2.7 Different Update Methods 

One can also make changes to the way we update the system. Suppose we first select $X$ different
individuals at each step, either randomly or in numerical sequence (first $\{1, 2, 3, \ldots, X\}$, then
$\{(X+1) \bmod E, (X+2) \bmod E, \ldots, (2X) \bmod E\}$ etc.). These individuals make their new artifact
choices at the same time, but no updates occur yet. Finally the system is updated simultaneously. The
simple Copying Model of [30, 32, 31, 36] discussed so far is the case of $X = 1$ with random selection.
The models discussed in the context of cultural transmission [62, 9, 43, 45, 8, 10, 7] choose $X = E$
($X = 100$ in the example of Fig. 16), where random and sequential updates are equivalent. What we find
numerically is shown in Fig. 16. When we update just one choice at each time step, $X = 1$, sequential
updating reaches equilibrium faster than random update but the equilibrium values are the same,
$F_2 = 1/2$ in Fig. 16. This is to be expected, as after $E$ updates random updating has not updated all
elements while sequential has. More surprising perhaps is the fact that for $X = E/2$ the time evolution is
similar but sequential and random updates produce different equilibrium results. This is also a lower
equilibrium than the original $X = 1$ model of (10) produces. Finally we see that if we imitate
[62, 9, 43, 45, 8, 10, 7] and update all choices simultaneously, $X = E$, then we get the lowest
equilibrium result, $F_2 = 1/3$ in Fig. 16.

Figure 16: Plots of $F_2$ for $E = N = 100$ and $p_r = 0.01$ for different update methods. From top to
bottom after 2000 rewirings we have sequential $X = 1$ (red), random $X = 1$ (green), sequential $X = 50$
(blue), random $X = 50$ (purple), and finally $X = 100$. Started from $n(k) = E \delta_{k,1}$ and averaged
over $10^4$ runs. From 071 .

² The analytic result is equivalent to a complete individual graph with tadpoles, i.e. with adjacency
matrix $A_{ij} = 1$.

In fact it is possible to obtain analytic results for $F_n$ in the $X = E$ case, and we find that we still
have the same form $F_2(t) = F_2(\infty) + (\lambda_2)^t (F_2(0) - F_2(\infty))$ as we found for $X = 1$
random updating in (30), but now we have

$$ F_2(t = \infty, X = E) = \frac{p_p^2 + (1 - p_p^2) \langle k \rangle}{p_p^2 + (1 - p_p^2) E} , \qquad (33) $$

$$ \lambda_2(X = E) = p_p^2 \left( 1 - \frac{1}{E} \right) . \qquad (34) $$

For the parameter values in Fig. 16 these formulae give the $F_2(\infty)$ values quoted above.
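
The simultaneous update case can be checked with a one-generation argument. The sketch below assumes that, when all $E$ individuals update at once, each one independently either innovates with probability $p_r$ or copies the previous choice of an individual drawn uniformly from all $E$; two different individuals then agree if they copy the same individual, if they copy two different individuals who agreed, or by chance when at least one innovates. The function names are hypothetical, and one time step here is one update of the whole population.

```python
def lambda_2_all(E: int, p_p: float) -> float:
    """Eigenvalue of Eq. (34) for simultaneous updating of all individuals."""
    return p_p**2 * (1.0 - 1.0 / E)


def f2_step_all(F2: float, E: int, N: int, p_p: float) -> float:
    """Expected F_2 after one simultaneous update of all E individuals."""
    both_copy = p_p**2 * ((1.0 - 1.0 / E) * F2 + 1.0 / E)
    # If at least one of the pair innovates they agree with probability 1/N.
    some_innovation = (1.0 - p_p**2) / N
    return both_copy + some_innovation


def f2_inf_all(E: int, N: int, p_p: float) -> float:
    """Equilibrium value of Eq. (33), with <k> = E / N."""
    return (p_p**2 + (1.0 - p_p**2) * E / N) / (p_p**2 + (1.0 - p_p**2) * E)


if __name__ == "__main__":
    E = N = 100
    p_p = 0.99  # i.e. p_r = 0.01
    F2 = 0.0
    for _ in range(2000):
        F2 = f2_step_all(F2, E, N, p_p)
    print("iterated:", F2, " Eq. (33):", f2_inf_all(E, N, p_p))
```

For $E = N = 100$ and $p_r = 0.01$ this gives $F_2(\infty) = 1/2.9701$, close to one third, which can be compared with the value close to one half obtained from (31) for $X = 1$.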



2.8 Different Communities of Individuals 

Finally we can also look at a situation where we split the population into several communities.
Individuals in each community will then share the same copying and innovation probabilities, but now we
are free to set different probabilities for copying which depend on the community of the individual being
updated and on the community of the individual whose current choice is being copied. What we wish to
monitor is the number of times the different communities have chosen an artifact. Each artifact has a
degree $k_\alpha$ for each community, indicating how many times individuals from community $\alpha$ have
chosen that artifact. We therefore have to look at the mean degree distribution $n(\{k_\alpha\}; t)$. So
the first step is to choose from which community the individual to be updated will be chosen, and this can
be done with probability $q_\alpha$ for community $\alpha$. Once the source community $\alpha$ has been
chosen, we then choose an individual at random from the $E_\alpha$ individuals in that community, and it is
the choice of this individual that we are going to change. This designates the source artifact which is
about to lose an edge. Now we have to determine the new choice for our chosen individual, the target
artifact. We can do this at random with some probability $p_{r\alpha}$, so that communities can have
different innovation rates. Alternatively the individual may decide to copy from an individual in some
community $\beta$, which it does with probability $p_{p\alpha\beta}$. In this process the probability of
attaching the edge being rewired, the choice made by an individual in community $\alpha$, to a given
artifact will be proportional to the degree $k_\beta$ of that artifact. For instance two extremes of
behaviour would be when communities ignore the choices of other communities,
$p_{p\alpha\beta} = \delta_{\alpha\beta}$, or at the other extreme where all communities copy the
'aspirational' choice made by a community $\gamma$ of 'leaders', so $p_{p\alpha\beta} = \delta_{\beta\gamma}$.
Within the limitation $p_{r\alpha} + \sum_\beta p_{p\alpha\beta} = 1$ there is a wide range of
alternatives. For instance this might be suitable to model the
choice of baby names which it has been suggested depends on the financial income of different groups

EH. 
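
The update rule for several communities can be sketched as a single rewiring step. The Python function below is only an illustration of the procedure described above, not the implementation used for any of the results: the data layout (lists of choices and community labels, dictionaries of probabilities) and every name are assumptions, and each community is assumed to be non-empty. Copying is done by picking a random member of the copied community, which selects an artifact with probability proportional to its degree within that community.

```python
import random


def community_step(choices, community, q, p_r, p_p, n_artifacts, rng):
    """Rewire one individual in place and return its index.

    choices[i]   : artifact currently chosen by individual i
    community[i] : community label of individual i
    q[a]         : probability that the updated individual comes from community a
    p_r[a]       : innovation probability for community a
    p_p[a][b]    : probability that community a copies from community b
    """
    labels = list(q)
    a = rng.choices(labels, weights=[q[x] for x in labels])[0]
    members = {x: [i for i, c in enumerate(community) if c == x] for x in labels}
    i = rng.choice(members[a])  # individual whose choice will change

    options = [None] + labels  # None stands for innovation
    weights = [p_r[a]] + [p_p[a][b] for b in labels]
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("p_r[a] + sum_b p_p[a][b] must equal 1")

    b = rng.choices(options, weights=weights)[0]
    if b is None:
        choices[i] = rng.randrange(n_artifacts)  # innovation: any artifact
    else:
        choices[i] = choices[rng.choice(members[b])]  # copy a member of community b
    return i


if __name__ == "__main__":
    rng = random.Random(1)
    community = ["X"] * 4 + ["Y"] * 3
    choices = [0, 1, 2, 3, 4, 5, 6]
    q = {"X": 0.5, "Y": 0.5}
    p_r = {"X": 0.1, "Y": 0.1}
    # Both communities copy only the 'leaders' in community X.
    p_p = {"X": {"X": 0.9, "Y": 0.0}, "Y": {"X": 0.9, "Y": 0.0}}
    for _ in range(1000):
        community_step(choices, community, q, p_r, p_p, 7, rng)
    print(choices)
```

Recomputing the community membership on every call keeps the sketch short; a longer simulation would normally build it once outside the loop.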

With $C$ communities one finds a $(C+1)$-dimensional PDE for the generating function of
$n(\{k_\alpha\}; t)$, but it does not admit a simple solution. One may build solutions iteratively for
$F(\alpha_1, \alpha_2, \ldots, \alpha_n; t)$, homogeneity measures which express the probability of finding
various types of edge attached to the same artifact. Even then the parameter space is now too large for a
simple analysis. For instance, in the case of two communities 'X' and 'Y' ($\alpha = X, Y$) shown in
Fig. 17 there are eight parameters, which may be chosen to be $q_x$, $p_{pxx}$, $p_{pxy}$, $p_{pyx}$,
$p_{pyy}$, $E_x$, $E_y$ and $N$. The simplest homogeneity measures are $F_{xx}$, $F_{xy}$ and $F_{yy}$,
where $F_{xx}$ is the probability that two different X individuals (individuals in community X) have chosen
the same artifact, $F_{xy}$ is the probability that one randomly chosen X individual and one randomly
chosen Y individual have chosen the same artifact, and so forth. These may be found analytically by finding
the eigenvalues of a three-dimensional matrix, but the details are lengthy and may be found in [36].

[Diagram labels: N artifacts; E_x individuals (community X); E_y individuals (community Y)]

Figure 17: The copying model with two communities of individuals, community X (Y) containing $E_x$
($E_y$) individuals indicated as blue circles (yellow diamonds), with black edges (green edges) connecting
them to the artifact that each individual has chosen.



3 Conclusions 

We have seen that a random walk is a very natural tool for the analysis of generalised random graphs and
for the analysis of real data sets. However it is much more than just an analysis tool. Since a random
walk can be performed using only local information, it is also likely to be an important natural process in
a wide variety of contexts. If the use of a random walk is to find new and potentially better information
from a network, then even if the actors in the system are only able to do short range walks, even if they
can only look at their neighbours, we are finding target vertices roughly in proportion to their
degree. In the case of growing networks, this is the most natural way for preferential attachment to
emerge and hence gives an explanation of why so many power law degree distributions are found in data
sets [75, 55].

However we can take this a step further and imagine that, having found a target vertex, we are likely
to copy some property of the target. In the growing networks model [75, 55] this was the creation of a
new link to the target vertex by copying the target of an existing link, the end of the last edge followed
in the random walk. Thus preferential attachment processes can be seen as emerging from local searches
done to exploit the information stored in the network, so that individuals/actors at a node may optimise
their situation by learning from the knowledge represented by the network.

The Copying Model is simplistic, but because it captures such a basic and naturally emergent process,
copying, on any sort of network, we should not be surprised to see that it has such wide applicability.
For instance, limiting ourselves to networks of constant size, its prediction of simple $1/k$ power laws
with exponential cutoffs for certain parameter ranges means we can understand such laws in terms of
this process when they are found elsewhere. The simplest examples give us a rare example of an exactly
solvable non-equilibrium process, known for any finite sized graph and for all times [30, 32, 31]. However
there are numerous extensions which may be needed for more realistic contexts where approximate
analytical results may still be possible [36].